Instructors:
Dan Kersten (kersten@umn.edu)
Paul Schrater (schrater@umn.edu)
This seminar will cover state-of-the-art computational models of human visual processing, incorporating evidence from primate neurophysiology, human neuroimaging, and psychophysics. We will learn how probabilistic modeling can provide a common framework linking computational theory to neural networks for a range of visual behaviors, including detection, perceptual integration, object learning and recognition, decision making, and adaptive dynamic behaviors. Each class will combine a short lecture previewing upcoming themes with a discussion of journal articles led by seminar participants.
Meeting time: Wednesdays 2 to 3:30 pm
Place: Elliott S204
Note: Additional background material can be found in a related course offered Fall 2013 by Alan Yuille at: http://www.stat.ucla.edu/~yuille/courses/Stat271-Fall13/Stat_271.html
Week | Topics | Background material | Discussion papers |
1 | Introduction |
Preview lecture 1 | |
2. Jan 29 | Architecture of Vision |
Kersten, D. J., & Yuille, A. L. (in press). Inferential Models of the Visual Cortical Hierarchy. In M. Gazzaniga (Ed.), The New Cognitive Neurosciences, 5th Edition (pp. 1–22). MIT Press. (pdf) Markov, N. T., Vezoli, J., Chameau, P., Falchier, A., Quilodran, R., Huissoud, C., et al. (2014). The anatomy of hierarchy: Feedforward and feedback pathways in macaque visual cortex. J Comp Neurol. doi:10.1002/cne.23458
|
Olshausen, B. A. (in press). Perception as an Inference Problem. In M. Gazzaniga (Ed.), The New Cognitive Neurosciences, 5th Edition (pp. 1–18). MIT Press. (pdf) Markov, N. T., & Kennedy, H. (2013). The importance of being hierarchical. Current Opinion in Neurobiology, 1–8. doi:10.1016/j.conb.2012.12.008 (pdf) Poggio, T., & Ullman, S. (2013). Vision: are models of object recognition catching up with the brain? Annals of the New York Academy of Sciences, 1–11. doi:10.1111/nyas.12148 (pdf) |
3. Feb 5 | Probabilistic models of neurons |
Modeling Neurons by Probabilities. Yuille lecture 3 notes Carandini, M. (2012). From circuits to behavior: a bridge too far? Nature Neuroscience, 15(4), 507–509. doi:10.1038/nn.3043 (pdf) Talebi, V., & Baker, C. L. (2012). Natural versus Synthetic Stimuli for Estimating Receptive Field Models: A Comparison of Predictive Robustness. Journal of Neuroscience, 32(5), 1560–1576. (pdf) Ayzenshtat, I., Gilad, A., Zurawel, G., & Slovin, H. (2012). Population Response to Natural Images in the Primary Visual Cortex Encodes Local Stimulus Attributes and Perceptual Processing. Journal of Neuroscience, 32(40), 13971–13986. doi:10.1523/JNEUROSCI.1596-12.2012 (pdf) Movshon/Seung brain mapping debate.
|
Poirazi, P., Brannon, T., & Mel, B. W. (2003). Pyramidal neuron as two-layer neural network. Neuron, 37(6), 989–999. (pdf) Gollisch, T., & Meister, M. (2010). Eye Smarter than Scientists Believed: Neural Computations in Circuits of the Retina. Neuron, 65(2), 150–164. doi:10.1016/j.neuron.2009.12.009 (pdf) |
4. Feb 12 | Models of neural interactions |
Neural interactions as Markov Random Fields models (Yuille lecture 4 notes)
Modeling local neural interactions: Pillow, J. (2007). Likelihood-based approaches to modeling the neural code. In K. Doya, S. Ishii, A. Pouget, & R. Rao (Eds.), Bayesian Brain: Probabilistic Approaches to Neural Coding (pp. 53–70). MIT Press, Cambridge, MA. (pdf) Heeger, D. J., Simoncelli, E. P., & Movshon, J. A. (1996). Computational models of cortical visual processing. Proceedings of the National Academy of Sciences of the United States of America, 93(2), 623–627. (pdf)
Perception of regions, local similarity, surface interpolation and texture: Allred, S. R., & Brainard, D. H. (2013). A Bayesian model of lightness perception that incorporates spatial variation in the illumination. Journal of Vision, 13(7), 18–18. doi:10.1167/13.7.18 (pdf)
Barron, J., & Malik, J. (2013). Shape, Illumination, and Reflectance from Shading (No. Technical Report UCB/EECS-2013-117). Electrical Engineering and Computer Sciences University of California at Berkeley. (pdf) Freeman, J., Ziemba, C. M., Heeger, D. J., Simoncelli, E. P., & Movshon, J. A. (2013). A functional and perceptual signature of the second visual area in primates. Nature Neuroscience. doi:10.1038/nn.3402 (pdf) Zhu, S. C., Wu, Y., & Mumford, D. (1998). Filters, random fields and maximum entropy (FRAME): Towards a unified theory for texture modeling. International Journal of Computer Vision, 27(2), 107–126. (pdf)
|
Carandini, M., & Heeger, D. J. (2011). Normalization as a canonical neural computation. Nature Reviews Neuroscience. doi:10.1038/nrn3136 (pdf)
Nakayama, K., & Shimojo, S. (1992). Experiencing and perceiving visual surfaces. Science, 257(5075), 1357–1363. doi:10.1126/science.1529336 (pdf) |
5. Feb 19 | Learning network parameters |
Berkes, P., Orban, G., Lengyel, M., & Fiser, J. (2011). Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science, 331(6013), 83–87. doi:10.1126/science.1195870 (pdf) Deep-belief networks: Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554. doi:10.1162/neco.2006.18.7.1527 (pdf)
Zeiler, M. D., Krishnan, D., Taylor, G. W., & Fergus, R. (2010). Deconvolutional networks, 2528–2535. doi:10.1109/CVPR.2010.5539957 (pdf) Hinton, G. E. (2007). Learning multiple layers of representation. Trends in Cognitive Sciences, 11(10), 428–434. doi:10.1016/j.tics.2007.09.004 (pdf)
|
Hinton, G. E. (2009). Learning to represent visual input. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1537), 177–184. doi:10.1098/rstb.2009.0200 (pdf) Mean Field Theory: See (Yuille lecture 4 notes) from |
6. Feb 26 | Sparse coding and V1 |
Natural image coding & back-pocket models: Olshausen, B. A. (2003). Learning sparse, overcomplete representations of time-varying natural images. Image Processing, 2003. ICIP 2003. Proceedings. 2003 International Conference on, 1, I–41–4 vol. 1. Rigamonti, R., Brown, M. A., & Lepetit, V. (2011). Are sparse representations really relevant for image classification?, 1545–1552. Dictionaries, super-pixels & patches: http://www.stat.ucla.edu/~yuille/courses/Stat238-Winter12/Superpixels.pdf
Sharon, E., Galun, M., Sharon, D., Basri, R., & Brandt, A. (2006). Hierarchy and adaptivity in segmenting visual scenes. Nature, 442(7104), 810–813. doi:10.1038/nature04977 Intrinsic image properties & V1?
Ayzenshtat, I., Gilad, A., Zurawel, G., & Slovin, H. (2012). Population Response to Natural Images in the Primary Visual Cortex Encodes Local Stimulus Attributes and Perceptual Processing. Journal of Neuroscience, 32(40), 13971–13986. doi:10.1523/JNEUROSCI.1596-12.2012 (pdf) Boyaci, H., Fang, F., Murray, S. O., & Kersten, D. (2007). Responses to Lightness Variations in Early Human Visual Cortex. Current Biology, 17(11), 989–993. doi:10.1016/j.cub.2007.05.005
|
Hyvärinen, A. (2010). Statistical Models of Natural Images and Cortical Visual Representation. Topics in Cognitive Science, 2(2), 251–264. doi:10.1111/j.1756-8765.2009.01057.x (pdf) Bell, A. J., & Sejnowski, T. J. (1996). Learning the higher-order structure of a natural sound. Network: Computation in Neural Systems, 7(2), 261–266. (pdf) Ren, X., & Malik, J. (2003). Learning a classification model for segmentation, 10–17. (pdf)
Yan, X., Khambhati, A., Liu, L., & Lee, T. S. (2012). Journal of Physiology - Paris. Journal of Physiology-Paris, 106(5-6), 250–265. doi:10.1016/j.jphysparis.2012.08.006 (pdf) |
7. Mar 5 | Image parsing |
von der Heydt, R. (2003). Image parsing mechanisms of the visual cortex. The Visual Neurosciences, 1139–1150. Fang, F., Boyaci, H., & Kersten, D. (2009). Border ownership selectivity in human early visual cortex and its modulation by attention. Journal of Neuroscience, 29(2), 460–465. doi:10.1523/JNEUROSCI.4628-08.2009 Lee, T. S., & Yuille, A. L. (2006). Efficient coding of visual scenes by grouping and segmentation: theoretical predictions and biological evidence. In K. Doya, S. Ishii, A. Pouget, & R. P. N. Rao (Eds.), Bayesian Brain: Probabilistic Approaches to Neural Coding (pp. 145–188). MIT Press, Cambridge, MA.
|
Ren, X., & Malik, J. (2003). Learning a classification model for segmentation, 10–17. (pdf) Sharon, E., Galun, M., Sharon, D., Basri, R., & Brandt, A. (2006). Hierarchy and adaptivity in segmenting visual scenes. Nature, 442(7104), 810–813. doi:10.1038/nature04977 (pdf)
von der Heydt, R. (2003). Image parsing mechanisms of the visual cortex. The Visual Neurosciences, 1139–1150. (pdf) Lee, T. S., & Yuille, A. L. (2006). Efficient coding of visual scenes by grouping and segmentation: theoretical predictions and biological evidence. In K. Doya, S. Ishii, A. Pouget, & R. P. N. Rao (Eds.), Bayesian Brain: Probabilistic Approaches to Neural Coding (pp. 145–188). MIT Press, Cambridge, MA. (pdf) |
8. Mar 12 | Compositional models: Inference |
Jin, Y., & Geman, S. (2006). Context and hierarchy in a probabilistic image model. Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on, 2, 2145–2152. (pdf) Zhu, L., Chen, Y., & Yuille, A. (2011). Recursive Compositional Models for Vision: Description and Review of Recent Work. Journal of Mathematical Imaging and Vision, 41(1-2), 122–146. (pdf) Ommer, B., & Buhmann, J. M. (2007). Learning the compositional nature of visual objects. Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on, 1–8. (pdf) Feldman, J. (2009). Bayes and the simplicity principle in perception. Psychological Review, 116(4), 875–887. (pdf) Kravitz, D. J., Kriegeskorte, N., & Baker, C. I. (2010). High-Level Visual Object Representations Are Constrained by Position. Cerebral Cortex, 20(12), 2916–2925. doi:10.1093/cercor/bhq042 (pdf) |
Barenholtz, E., & Tarr, M. J. (2008). Visual judgment of similarity across shape transformations: Evidence for a compositional model of articulated objects. Acta Psychologica, 128(2), 331–338. (pdf)
Orbán, G., Fiser, J., Aslin, R. N., & Lengyel, M. (2008). Bayesian learning of visual chunks by human observers. Proceedings of the National Academy of Sciences of the United States of America, 105(7), 2745. (pdf) |
Mar 19 | Spring Break |
||
9. Mar 26 | Compositional models: Learning |
Hegdé, J., Thompson, S., Brady, M., & Kersten, D. (2012). Object Recognition in Clutter: Cortical Responses Depend on the Type of Learning. Frontiers in Human Neuroscience. doi:10.3389/fnhum.2012.00170 (pdf) Epshtein, B., Lifshitz, I., & Ullman, S. (2008). Image interpretation by a single bottom-up top-down cycle. Proceedings of the National Academy of Sciences of the United States of America, 105(38), 14298. (pdf) Tenenbaum, J. B., Kemp, C., Griffiths, T. L., & Goodman, N. D. (2011). How to Grow a Mind: Statistics, Structure, and Abstraction. Science, 331(6022), 1279–1285. doi:10.1126/science.1192788 (pdf) Anselmi, F., Leibo, J. Z., Rosasco, L., Mutch, J., Tacchetti, A., & Poggio, T. (2014). Unsupervised learning of invariant representations with low sample complexity: the magic of sensory cortex or a new framework for machine learning? (No. CBMM Memo No. 001) (pp. 1–23). Center for Brains, Minds & Machines. (pdf)
|
Yuille, A. (2011). Towards a theory of compositional learning and encoding of objects. Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on, 1448–1455. (pdf) Salakhutdinov, R., Tenenbaum, J., & Torralba, A. (2012). Learning with Hierarchical-Deep Models. doi:10.1109/TPAMI.2012.269 (pdf) Also, take a look at the video: How to Grow a Mind: Statistics, Structure and Abstraction
|
10. Apr 2 | Information combination & neural population codes |
Fetsch, C. R., Pouget, A., DeAngelis, G. C., & Angelaki, D. E. (2011). Neural correlates of reliability-based cue weighting during multisensory integration. Nature Neuroscience, 15(1), 146–154. doi:10.1038/nn.2983 (pdf) Fetsch, C. R., DeAngelis, G. C., & Angelaki, D. E. (2013). Bridging the gap between theories of sensory cue integration and the physiology of multisensory neurons. Nature Reviews Neuroscience, 1–14. doi:10.1038/nrn3503 (pdf)
|
Pouget, A., Beck, J. M., Ma, W. J., & Latham, P. E. (2013). Probabilistic brains: knowns and unknowns. Nature Neuroscience. doi:10.1038/nn.3495 (pdf)
van Atteveldt, N., Murray, M. M., Thut, G., & Schroeder, C. E. (2014). Multisensory Integration: Flexible Use of General Operations. Neuron, 81(6), 1240–1253. doi:10.1016/j.neuron.2014.02.044 (pdf)
|
11. Apr 9 | Sampling models |
Sundareswara, R., & Schrater, P. R. (2008). Perceptual multistability predicted by search model for Bayesian decisions. Journal of Vision, 8(5), 12–12. (pdf) Moreno-Bote, R., Knill, D. C., & Pouget, A. (2011). Bayesian sampling in visual perception. Proceedings of the National Academy of Sciences of the United States of America, 108(30), 12491–12496. doi:10.1073/pnas.1101430108 (pdf) Battaglia, P. W., Kersten, D., & Schrater, P. R. (2011). How haptic size sensations improve distance perception. PLoS Computational Biology, 7(6), e1002080. doi:10.1371/journal.pcbi.1002080 Gershman, S. J., Vul, E., & Tenenbaum, J. B. (2012). Multistability and perceptual inference. Neural Computation, 24(1), 1–24. Beck, J. M., Ma, W. J., Pitkow, X., Latham, P. E., & Pouget, A. (2012). Not noisy, just wrong: the role of suboptimal inference in behavioral variability. Neuron, 74(1), 30–39. doi:10.1016/j.neuron.2012.03.016 Berkes, P., Orban, G., Lengyel, M., & Fiser, J. (2011). Spontaneous cortical activity reveals hallmarks of an optimal internal model of the environment. Science, 331(6013), 83–87. doi:10.1126/science.1195870 (pdf) (See also Week 5). |
For background see: http://pillowlab.cps.utexas.edu/teaching/TSNC13/slides/slides23_SamplingAndMCMC.pdf Sundareswara, R., & Schrater, P. R. (2008). Perceptual multistability predicted by search model for Bayesian decisions. Journal of Vision, 8(5), 12–12. (pdf) Fiser, J., Berkes, P., Orban, G., & Lengyel, M. (2010). Statistically optimal perception and learning: from behavior to neural representations. Trends in Cognitive Sciences, 14(3), 119–130. doi:10.1016/j.tics.2010.01.003 (pdf)
|
12. Apr 16 | Learning, reinforcement & perception |
Gershman, S. J., & Niv, Y. (2010). Learning latent structure: carving nature at its joints. Current Opinion in Neurobiology, 20(2), 251–256. doi:10.1016/j.conb.2010.02.008 |
Bavelier, D., Green, C. S., Pouget, A., & Schrater, P. (2012). Brain Plasticity Through the Life Span: Learning to Learn and Action Video Games. Annual Review of Neuroscience, 35(1), 391–416. (pdf) Acuña, D. E., & Schrater, P. (2010). Structure learning in human sequential decision-making. PLoS Computational Biology, 6(12), e1001003. doi:10.1371/journal.pcbi.1001003 (pdf) |
13. Apr 23 | Decision making |
Yang, T., & Shadlen, M. N. (2007). Probabilistic reasoning by neurons. Nature, 447(7148), 1075–1080. doi:10.1038/nature05852 (pdf) Srivastava, N., & Schrater, P. R. (2011). A predictive model for self-motivated decision-making behavior. Proceedings of BRIMS. |
Dayan, P., & Daw, N. D. (2008). Decision theory, reinforcement learning, and the brain. Cognitive, Affective, & Behavioral Neuroscience, 8(4), 429–453. doi:10.3758/CABN.8.4.429 (pdf)
Huang, Y., & Rao, R. P. N. (2013). Reward Optimization in the Primate Brain: A Probabilistic Model of Decision Making under Uncertainty. PLoS ONE, 8(1), e53344. doi:10.1371/journal.pone.0053344 (pdf)
|
14. Apr 30 | Resource allocation and attention |
Poort, J., Raudies, F., Wannig, A., Lamme, V. A. F., Neumann, H., & Roelfsema, P. R. (2012). The Role of Attention in Figure-Ground Segregation in Areas V1 and V4 of the Visual Cortex. Neuron, 75(1), 143–156. doi:10.1016/j.neuron.2012.04.032 Fulvio, J. M., Green, C. S., & Schrater, P. R. (2014). Task-Specific Response Strategy Selection on the Basis of Recent Training Experience. PLoS Computational Biology, 10(1), e1003425. doi:10.1371/journal.pcbi.1003425 Ullman, S. (1995). Sequence seeking and counter streams: a computational model for bidirectional information flow in the visual cortex. Cerebral Cortex, 5(1), 1–11. Borji, A., & Itti, L. (2013). State-of-the-art in visual attention modeling. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1), 185–207. doi:10.1109/TPAMI.2012.89
|
Ma, W. J., Husain, M., & Bays, P. M. (2014). Changing concepts of working memory. Nature Neuroscience, 17(3), 347–356. doi:10.1038/nn.3655 (pdf)
Vul, E., Alvarez, G., Tenenbaum, J. B., & Black, M. J. (2009). Explaining human multiple object tracking as resource-constrained approximate inference in a dynamic probabilistic model, NIPS, 1955–1963. (pdf) |
15. May 7 | Forward models and physics |
Battaglia, P. W., Hamrick, J. B., & Tenenbaum, J. B. (2013). Simulation as an engine of physical scene understanding. Proceedings of the National Academy of Sciences of the United States of America, 110(45), 18327–18332. doi:10.1073/pnas.1306572110 | Wolpert, D.M., Doya, K., and Kawato, M. (2003). A unifying computational framework for motor control and social interaction. Philos Trans R Soc Lond B Biol Sci 358, 593-602. (pdf)
Battaglia, P. W., Hamrick, J. B., & Tenenbaum, J. B. (2013). Simulation as an engine of physical scene understanding. Proceedings of the National Academy of Sciences of the United States of America, 110(45), 18327–18332. doi:10.1073/pnas.1306572110 (pdf)
Also, take a look at Daniel Wolpert's TED talk: "The real reason for brains"
|
Extra topics: Temporal hierarchies |
Hasson, U., Yang, E., Vallines, I., Heeger, D. J., & Rubin, N. (2008). A Hierarchy of Temporal Receptive Windows in Human Cortex. Journal of Neuroscience, 28(10), 2539–2550. doi:10.1523/JNEUROSCI.5487-07.2008 Lerner, Y., Honey, C. J., Silbert, L. J., & Hasson, U. (2011). Topographic Mapping of a Hierarchy of Temporal Receptive Windows Using a Narrated Story. Journal of Neuroscience, 31(8), 2906–2915. doi:10.1523/JNEUROSCI.3684-10.2011 Körding, K. P., & Tenenbaum, J. (2007). Multiple timescales and uncertainty in motor adaptation. Advances in Neural Information Processing Systems. |
||